LLM Inference
Quantization, Attention Mechanisms, Batch Processing, KV Caching
Scoured 20749 posts in 566.0 ms
PackInfer: Compute- and I/O-Efficient Attention for Batched LLM Inference
arxiv.org · 17h · LLM Infrastructure
Understanding LLM Inference Engines: Inside Nano-vLLM (Part 2)
neutree.ai · 3d · Discuss: Hacker News · LLM Infrastructure
Automating Inference Optimizations with NVIDIA TensorRT LLM AutoDeploy
developer.nvidia.com · 3h · LLM Infrastructure
Large Language Models Live in Time
lesswrong.com · 7h · LLM Infrastructure
RoPE-LIME: RoPE-Space Locality + Sparse-K Sampling for Efficient LLM Attribution
arxiv.org · 17h · Sparse Embeddings
Mechanistic Interpretability: Peeking Inside an LLM
towardsdatascience.com · 4d · LLM Infrastructure
Quantization-Aware Distillation
ternarysearch.blogspot.com · 1d · Discuss: Hacker News · BitNet
Import AI 444: LLM societies; Huawei makes kernels with AI; ChipBench
importai.substack.com · 8h · Discuss: Substack · LLM Benchmarking
Optimized LLM Inference Engines
rishirajacharya.com · 5d · LLM Infrastructure
Tutorial - What is a variational autoencoder?
jaan.io · 5h · Discuss: Hacker News · Embeddings Optimization
AI-augmented data quality engineering
infoworld.com · 12h · AI Interpretability
DeepChopper model improves RNA sequencing research by mitigating chimera artifacts
phys.org · 1h · LLM Infrastructure
Recursive Deductive Verification: A framework for reducing AI hallucinations
news.ycombinator.com · 1d · Discuss: Hacker News · AI Security
What Google and Microsoft patents teach us about GEO
searchengineland.com · 7h · Search Economics
25W06. Learning a language with the machine
z1nz0l1n.com · 1d · Tokenization
Finding the needle in the logstack: Reducing LLM context with TF-IDF
eliseomartelli.it · 4d · LLM Infrastructure
Main Content || Math ∩ Programming
jeremykun.com · 23h · Data Structures
Study: Platforms that rank the latest LLMs can be unreliable
news.mit.edu · 17h · LLM Benchmarking
A Note on Flat Abstract Syntax Trees
gist.github.com · 3h · Discuss: Hacker News · Linear Types
Expectation and Copysets
buttondown.com · 3h · Discuss: Hacker News · Binary Formats